Reviews: Recovering Bandits

Neural Information Processing Systems

The major point that needs to be clarified for me is the distinction between single-play regret and multiple-play regret. First, the UCB-Z algorithm, for example, is not of this type, and it is a bit odd to have a regret notion that only applies to very specific algorithms. Second, regarding the distinction between single- and multiple-play regret, am I right that E[R_T^{(d,m)}] ≥ E[R_T^{(d)}]? That is, one would in principle care about the multiple-play lookahead regret, since one should be allowed to play an arm multiple times during the d time steps over which we optimize. From my understanding, the single-play notion would be defined only for strategies that select each arm at most once during the d steps (and therefore d cannot be larger than K in this case, right?).


Dynamic Planning and Learning under Recovering Rewards

Simchi-Levi, David, Zheng, Zeyu, Zhu, Feng

arXiv.org Machine Learning

Motivated by emerging applications such as live-streaming e-commerce, promotions and recommendations, we introduce a general class of multi-armed bandit problems that have the following two features: (i) the decision maker can pull and collect rewards from at most $K$ out of $N$ different arms in each time period; (ii) the expected reward of an arm immediately drops after it is pulled, and then non-parametrically recovers as the idle time increases. With the objective of maximizing expected cumulative rewards over $T$ time periods, we propose, construct and prove performance guarantees for a class of "Purely Periodic Policies". For the offline problem when all model parameters are known, our proposed policy obtains an approximation ratio of the order of $1-\mathcal O(1/\sqrt{K})$, which is asymptotically optimal as $K$ grows to infinity. For the online problem when the model parameters are unknown and need to be learned, we design an Upper Confidence Bound (UCB) based policy that approximately has $\widetilde{\mathcal O}(N\sqrt{T})$ regret against the offline benchmark. Our framework and policy design may have the potential to be adapted into other offline planning and online learning applications with non-stationary and recovering rewards.
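The "pull each arm on a fixed period" idea behind purely periodic policies can be sketched with a toy single-arm calculation (an illustrative reduction, not the paper's actual policy construction; the recovery function and candidate periods below are made up): pulling an arm once every p periods earns its recovered reward f(p) per pull, i.e. an average of f(p)/p per period, so a periodic schedule should use the period maximizing that ratio.

```python
import math

def best_period(recovery, periods):
    """Pick the pulling period p maximizing the long-run reward rate f(p)/p.

    `recovery(z)` is the expected reward of an arm that has been idle
    for z periods; pulling it every p periods earns recovery(p) per pull,
    i.e. recovery(p) / p per period on average.
    """
    return max(periods, key=lambda p: recovery(p) / p)

# Hypothetical recovery function: reward drops after a pull and recovers
# slowly at first, so pulling every single period is suboptimal.
f = lambda z: 1 - math.exp(-0.3 * z ** 2)

print(best_period(f, range(1, 7)))  # → 2
```

With this particular curve, waiting two periods between pulls (rate ≈ 0.35 per period) beats pulling every period (rate ≈ 0.26), which is the basic trade-off a periodic schedule exploits.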


Recovering Bandits

Pike-Burke, Ciara, Grünewälder, Steffen

arXiv.org Machine Learning

We study the recovering bandits problem, a variant of the stochastic multi-armed bandit problem where the expected reward of each arm varies according to some unknown function of the time since the arm was last played. While being a natural extension of the classical bandit problem that arises in many real-world settings, this variation is accompanied by significant difficulties. In particular, methods need to plan ahead and estimate many more quantities than in the classical bandit setting. In this work, we explore the use of Gaussian processes to tackle the estimation and planning problem. We also discuss different regret definitions that let us quantify the performance of the methods. To improve the computational efficiency of the methods, we provide an optimistic planning approximation. We complement these discussions with regret bounds and empirical studies.
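The setting described above — each arm's expected reward depending on the time since it was last played — can be simulated with a minimal sketch (the exponential recovery curves and the myopic greedy baseline below are illustrative assumptions, not the paper's GP-based method):

```python
import numpy as np

def simulate(recovery_fns, policy, T=200, seed=0):
    """Run one recovering-bandit episode and return the total reward.

    recovery_fns[k](z) gives arm k's expected reward after z idle steps;
    `policy(z)` maps the vector of idle times to the index of the arm to pull.
    """
    rng = np.random.default_rng(seed)
    z = np.ones(len(recovery_fns), dtype=int)   # idle time of each arm
    total = 0.0
    for _ in range(T):
        arm = policy(z)
        total += recovery_fns[arm](z[arm]) + rng.normal(0.0, 0.1)
        z += 1                                   # every arm recovers one step...
        z[arm] = 1                               # ...except the one just pulled
    return total

# Hypothetical recovery curves: reward drops after a pull, then recovers.
fns = [lambda z: 0.5 * (1 - np.exp(-0.5 * z)),
       lambda z: 1.0 * (1 - np.exp(-0.2 * z))]

greedy = lambda z: int(np.argmax([f(zk) for f, zk in zip(fns, z)]))
always0 = lambda z: 0

# Even a myopic policy that tracks recovery beats ignoring it entirely.
print(simulate(fns, greedy) > simulate(fns, always0))  # → True
```

The greedy policy here is one-step lookahead only; the planning difficulty the abstract refers to is exactly that the best action can depend on the rewards of whole pull sequences, not just the current idle times.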